Estimation of Perceptual Spaces for Speaker Identities Based on the Cross-Lingual Discrimination Task
نویسندگان
چکیده
This paper reconfirms that talker identity can be transmitted across languages. Talker discrimination was examined in the ABX paradigm, where the stimuli A and B were utterances by different talkers in the same language and the stimulus X was an utterance by either of A or B in the different language. The average hit rate of this discrimination task was as high as 0.89. The mutual distance matrices were generated using the discrimination index, ′ d . By applying the multidimensional scaling, three-dimensional perceptual spaces were estimated. The features related with loudness and spectral centroid had high contribution to the perceptual dimensions.
منابع مشابه
Cross-lingual talker discrimination
This paper describes a talker discrimination experiment in which native English listeners were presented with two sentences spoken by bilingual talkers (English/German and English/Finnish) and were asked to judge whether they thought the sentences were spoken by the same person or not. Equal amounts of cross-lingual and matched-language trials were presented. The experiments showed that listene...
متن کاملAn analysis of language mismatch in HMM state mapping-based cross-lingual speaker adaptation
This paper provides an in-depth analysis of the impacts of language mismatch on the performance of cross-lingual speaker adaptation. Our work confirms the influence of language mismatch between average voice distributions for synthesis and for transform estimation and the necessity of eliminating this mismatch in order to effectively utilize multiple transforms for cross-lingual speaker adaptat...
متن کاملCross-lingual Speaker Adaptation for HMM-based Speech Synthesis based on Perceptual Characteristics and Speaker Interpolation
This paper proposes a cross-lingual speaker adaptation (CLSA) method based on perceptual characteristics (PCs). To develop a CLSA system, a state mapping (SM) based method has been recently proposed. This method extracts speaker characteristics directly from source acoustic features as linear transforms and applies them to target models. However, it is difficult to completely eliminate language...
متن کاملSpeaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation
A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...
متن کاملSpeaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation
A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...
متن کامل